Semi-supervised Feature Selection via Rescaled Linear Regression

نویسندگان

  • Xiaojun Chen
  • Guowen Yuan
  • Feiping Nie
  • Joshua Zhexue Huang
چکیده

With the rapid increase of complex and highdimensional sparse data, demands for new methods to select features by exploiting both labeled and unlabeled data have increased. Least regression based feature selection methods usually learn a projection matrix and evaluate the importances of features using the projection matrix, which is lack of theoretical explanation. Moreover, these methods cannot find both global and sparse solution of the projection matrix. In this paper, we propose a novel semi-supervised feature selection method which can learn both global and sparse solution of the projection matrix. The new method extends the least square regression model by rescaling the regression coefficients in the least square regression with a set of scale factors, which are used for ranking the features. It has shown that the new model can learn global and sparse solution. Moreover, the introduction of scale factors provides a theoretical explanation for why we can use the projection matrix to rank the features. A simple yet effective algorithm with proved convergence is proposed to optimize the new model. Experimental results on eight real-life data sets show the superiority of the method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On The Semi-Supervised Joint Trained Elastic Net

The elastic net (supervised enet henceforth) is a popular and computationally efficient approach for performing the simultaneous tasks of selecting variables, decorrelation, and shrinking the coefficient vector in the linear regression setting. Semi-supervised regression, currently unrelated to the supervised enet, uses data with missing response values (unlabeled) along with labeled data to tr...

متن کامل

Graph Laplacian for Semi-supervised Feature Selection in Regression Problems

Feature selection is fundamental in many data mining or machine learning applications. Most of the algorithms proposed for this task make the assumption that the data are either supervised or unsupervised, while in practice supervised and unsupervised samples are often simultaneously available. Semi-supervised feature selection is thus needed, and has been studied quite intensively these past f...

متن کامل

A graph Laplacian based approach to semi-supervised feature selection for regression problems

Feature selection is a task of fundamental importance for many data mining or machine learning applications, including regression. Surprisingly, most of the existing feature selection algorithms assume the problems to address are either supervised or unsupervised, while supervised and unsupervised samples are often simultaneously available in real-world applications. Semi-supervised feature sel...

متن کامل

Hypergraph Spectra for Semi-supervised Feature Selection

In many data analysis tasks, one is often confronted with the problem of selecting features from very high dimensional data. Most existing feature selection methods focus on ranking individual features based on a utility criterion, and select the optimal feature set in a greedy manner. However, the feature combinations found in this way do not give optimal classification performance, since they...

متن کامل

Semi-supervised Feature Selection via Spectral Analysis

Feature selection is an important task in effective data mining. A new challenge to feature selection is the socalled “small labeled-sample problem” in which labeled data is small and unlabeled data is large. The paucity of labeled instances provides insufficient information about the structure of the target concept, and can cause supervised feature selection algorithms to fail. Unsupervised fe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017